b'        NATIONAL\n       ARCHIVES\n      OFFICE of\n INSPECTOR GENERAL\n\n\nDate     :   February 21, 2012\n\nReply to\nAttn of :    Office of Inspector General (OIG)\n\nSubject:     Management Letter No. 12-06, Access to Records in the Base Electronic Records\n             Archive System\n\nTo       :   David S. Ferriero, Archivist of the United States (N)\n\nOn January 19, 2012, the OIG office met with NARA officials to discuss access to unstructured\nrecords stored in the Electronic Records Archives (ERA). The purpose of this memorandum is\nto formally bring to your attention conditions which impact NARA\'s ability to provide access to\nthe essential evidence documenting the rights of citizens, the actions of Federal officials, and the\nnational experience. These conditions are further amplified by the Presidential Memorandum on\nManaging Government Records which highlights the importance of electronic records, and the\nneed for their effective management.\n\nThe ERA system was to enable NARA to realize its strategic mission with regards to electronic\nrecords, and business transactions associated with NARA records. The NARA OIG has issued\nnumerous reports on the ERA program which identified cost overruns, failed deliverables, and\nfunctional deficiencies. These prior reports were issued while NARA was in a developmental\nproject phase with the prime contractor, Lockheed Martin (LM). The ERA program is now in an\nOperations & Maintenance (O&M) phase with constrained funding. 
Specific to the base ERA\nsystem (excluding the Presidential and Congressional "instances"), what was developed at a cost\nof over $383 million is a shell of that which was envisioned at program inception, and one that\nfails to meet even the most basic of requirements initially defined in the contract.\n\nThe following outlines some of the primary functional failings or deficiencies existing in ERA:\n\n       \xe2\x80\xa2 \t There are no reliable automated tools, or staff, available to validate that Federal records\n           transferred by originating agencies to be ingested into base ERA are free of national\n           security classified content. NARA officials involved in this process are aware this will\n           create an unmanageable backlog of records which are not even ingested into ERA.\n           Indeed, it is assumed a large volume of records will have to be returned to the originating\n           agency for review when the potential for classified content is detected (even if it is a\n           false-positive). NARA officials acknowledge this may delay public access for many\n           years.\n\n\n\n                                     NARA\'s web site is http://www.nara.gov\n\x0c   \xe2\x80\xa2 \t The same problems related to screening for classified information exist for personally\n       identifiable information (PII). There is no reliable automated tool for screening for PII,\n       and the manual process proposed is too labor intensive to be feasible. Even if minimal\n       PII is found in a record, base ERA has no redaction capability to provide the public a\n       releasable version of the record.\n\n   \xe2\x80\xa2 \t Unstructured records ingested into base ERA would be subject to archival treatment\n       consistent with that afforded existing paper records. This includes such things as\n       formulating records descriptions, finding aids, etc. It would be manual and labor\n       intensive. 
However, no meaningful additional staff resources are, or will be, available to\n       NARA in the foreseeable future to accomplish this work.\n\n   \xe2\x80\xa2 \t The public will only be able to search for records in base ERA by contacting an Archivist\n       (thus replicating a paper process in this the digital age), unless special treatment has been\n       afforded to a record by having it moved from base ERA to the On-Line Public Access\n       (OPA) system. OPA is not a part of ERA and currently houses just a test sample of 28\n       files of base ERA data. There is currently no means nor process to ingest or transfer\n       meaningful quantities of records from base ERA to OPA.\n\n   \xe2\x80\xa2 \t The public will not have efficient access to the unstructured records, such as emails or\n       word processing files, housed in base ERA. The vast majority of records will not be\n       afforded special treatment and copied into OPA. Those files which remain in base ERA\n       are not full-text searchable, even by NARA archivists. Thus, even if a researcher knows\n       an exact phrase used in a Word document, they will still have to use a search process\n       replicating the paper system to retrieve many possible records and manually search them.\n\n   \xe2\x80\xa2 \t Notwithstanding the previous bullets, should an unstructured base ERA record be\n       identified and made available to a requester, the record would be in the format or\n       language it was created in, not as envisioned as an "asset" independent of the hardware or\n       software in which it was created. 
Thus the recipient may not be able to readily view the\n       document as it may be in a computer code or language no longer readily available.\n\n   \xe2\x80\xa2 \t An over the horizon system which would afford users direct access from any location is\n       now one in which (unless the file can be downloaded) a NARA archivist must be\n       contacted to search for a record, and a fee structure or "cost recovery fee" for born-digital\n       records has been defined. The current fees are set at $15 per file when the order contains\n       10 or fewer files, and $13 per file for orders of 11 or more files.\n\nIn plain text, the ERA system at "full operating capability" and at the OMB mandated end of the\ndevelopmental phase (September 30, 2011) is neither functional nor operational in meeting the\nbasic needs and requirements of our users, the American people. Records that are "clean" and\nscheduled to afford the public access cannot be readily identified, located and presented to\ninterested parties.\n\nWe believe it is imperative that NARA\'s stakeholders be provided clear and unfiltered\ninformation as to the status of ERA. Per the ERA Base User Liaison, the last revision of the\n"ERA Misconceptions and Facts" page on Archives.gov was in April 2011. On January 9, 2012,\neight questions were transmitted by the Inspector General to the Chief Information Officer\n(CIO). In the correspondence, the CIO was invited to "share these questions with your\ncolleagues as necessary. Please note for all of these questions we are only referring to records\nthe public has the right to view with appropriate redactions (PII, privacy, etc.)." 
The responses\nto the eight questions are included in their entirety in Attachment A.\n\nThe OIG plans to perform a series of audits on the life cycle of electronic records in base ERA,\nfrom their ingestion through preservation and dissemination to customers, during FY 2012 which\nwill focus on the conditions mentioned above. If you have any questions concerning the\ninformation presented in this Management Letter, please contact me at (301) 837-1532.\n\n\n\nPaul Brachfeld\nInspector General\n\n\x0cAttachment A\n\nQuestion 1.\nI am interested in unstructured records (word processing files, email files, etc.) which are being\ninitially ingested into base ERA. I am not referencing unstructured records which were\nincorporated into AAD, ARC or other legacy systems first before being transferred over to base\nERA; or those records selected from base ERA and copied back over to legacy systems so that\nOPA can search them. How can researchers access and explore unstructured records which have\nonly been ingested into base ERA at the present time, and do not reside in any other form or\nsystem?\n\nResponse: We never envisioned, nor do we ever intend, that public users would access records in\nBase. Because many records in Base contain sensitive information exempt under FOIA, public\naccess is not permissible. Rather, following archival screening and any necessary withdrawal or\nredaction, public access to ERA records will be through OPA. We view the ERA Base repository\nas analogous to the stacks, where researchers cannot roam freely to access what they choose. As\nwith all archival research, researchers start with a search of record descriptions in ARC/OPA. 
If\nthe researcher finds descriptions of records of interest but the records themselves are not\navailable online as part of the description, the researcher contacts the electronic records reference\nstaff to request access to (including obtaining copies of) the records in question. This process is\nparallel to the process researchers use to get any records in physical format. Therefore, if a\nresearcher is interested in a series containing thousands of PDFs and there is no index to\ndocuments, the user could obtain copies of all of the PDFs (at a fee) to browse. In addition, a\nresearcher will be able to "explore" any set of unstructured electronic records online once the\nrecords have been transferred into OPA.\n\nQuestion 2.\nIf researchers can access and explore these files, what is the methodology a researcher would\nfollow? I know for specific structured data-files requested from base a NARA archivist can\npopulate media and transmit it to a requestor. Thus if a researcher already knows a specific file\nwas ingested into ERA, and knows accession number or other unique identifier, then the file can\nbe pulled. What I want to know is what does a researcher do who only knows they are interested\nin a topic and is looking for files on that topic? If there is a methodology, please provide an\nillustrative case where this has occurred (remember I am interested in unstructured records that\ndid not already reside in one of the legacy systems and is not from the Presidential or Congressional\ninstance).\n\nResponse: See response to Question 1.\n\nQuestion 3.\nCurrently, there is a format and structure followed for researchers to access paper records which\nhappens all the time in our research rooms across the country. 
How does/will that process play\nout in the very near future with records ingested into base ERA that have yet to be uploaded into\nthe public interface (OPA)? Is that process and capability defined and can you provide it to us?\n\nResponse: As with all archival research, researchers start with a search of record descriptions in\nARC/OPA. If the researcher finds descriptions of records of interest but the records themselves\nare not available online as part of the description, the researcher contacts the electronic records\nreference staff to access (including obtaining copies of) the records in question. This process is\nparallel to the process researchers use to get any records in physical formats. If a researcher is\ninterested in a series containing thousands of PDFs and there is no index to documents, the user\ncould obtain copies of all of the PDFs (at a fee) to browse. The process of providing reference\nservice for electronic records is defined in SOPs which we will be happy to share.\n\nQuestion 4.\nA) What is the volume of unstructured records expected to be ingested into base ERA over the\nnext few months, years etc., and\nB) do we have a definitive described process and capability to respond to researchers\' requests for\nthese digital records?\nC) Do we have an estimate as to how many requests we are to have for unstructured records that\nare coming to the base ERA system from Federal agencies?\n\nResponse:\nA) We do not have specific volume numbers for future transfers of unstructured records. We\nhave estimates of e-records in any format that we expect to be transferred over the short term but\nthese are not broken down by structured and unstructured and we do not know exactly when they\nwill come in.\nB) As with all archival research, researchers start with a search of record descriptions in\nARC/OPA. 
If the researcher finds descriptions of records of interest but the records themselves\nare not available online as part of the description, the researcher contacts the electronic records\nreference staff to access (including obtaining copies of) the records in question. This process is\nparallel to the process researchers use to get any records in physical formats. The electronic\nrecords reference unit has SOPs for responding to requests, which they would be happy to share.\nC) We cannot estimate future reference demand. Based on Electronic Records Section staff\nexperience, since fiscal year 2011, out of the approximately 2000 research inquiries received,\nthere have been maybe a dozen at most inquiries for unstructured electronic records (PDFs,\nemails). As best as staff can recollect, only one inquiry was for unstructured electronic records\nthat were only in Base ERA.\n\nQuestion 5.\nIf additional resources and funds will be required to facilitate the public\'s access to records\ningested into base ERA has this been defined and communicated to our oversight committees\nand OMB? If so, please provide.\n\nResponse: We\'ve talked to OMB about our needs through 2013 and presented projections for the\nexhibit 300. Project level details for public access have not been provided since public access\nthrough OPA is still our planned approach and there will be no separate method of public access\nto Base records. The FY 2013 request is under an OMB embargo until it is delivered to Congress\non February 6.\n\nThe recently signed FY 2012 budget marked the first time that ERA funding was provided as an\noperations and maintenance (O&M) line within NARA\'s OE account. In the course of\ndiscussions that led to passage of that budget, House and Senate Appropriations Committees\nstaffs were briefed on the O&M contract. 
There were no discussions about additional funds\nabove the FY 2012 request; rather, the most recent discussions dealt with how much the\nCommittees could cut ERA O&M below the request without significant disruptions to the\nprogram.\n\nQuestion 6.\nIf someone requests access to an unstructured base ERA record, if we can provide them the\nrecords, will it be in original language/format or restructured in some manner?\n\nResponse: At this point in time we can only provide exact copies of unstructured electronic\nrecords (PDFs, HTML, Word, etc.). The copies will be in the format as transferred by the agency.\nFormat migration is a desired enhancement expected to be developed at some point in the future.\n\nQuestion 7.\nIs there a charge for researchers who obtain structured or unstructured data from base ERA? If\nso, how is that charge defined?\n\nResponse: Normal fees would apply. The current schedule for reproduction orders, including\ncopies of digital files, is available here: http://www.archives.gov/research/order/fees.html. The\nNational Archives Trust Fund has set the cost-recovery fee for "born-digital" records, whether\nfrom Base ERA or legacy systems, at a $15 fee per file when the order contains 10 or fewer files\nand for orders of 11 or more files, the fee is $13 per file. There are no charges for electronic\nrecords available for download via ARC/OPA or for searching via AAD.\n\nQuestion 8.\nBehind all the complexity I simply want to know the answer to a fundamental question I have\nbeen asking for a decade which goes something like this. If a person wants to access a digital\nrecord that has been ingested into base ERA (and only resides there) how will they be able to\nlocate and access the record from their desktop or portable device?\n\nResponse: See response to Question 1.
\n'